
US006457026B1 

(12) United States Patent (10) Patent No.: US 6,457,026 B1

Graham et al. (45) Date of Patent: Sep. 24, 2002 



(54) SYSTEM TO FACILITATE READING A 
DOCUMENT 

(75) Inventors: Jamey Graham, Los Altos; David G. 

Stork, Portola Valley, both of CA (US) 

(73) Assignees: Ricoh Company, Ltd., Tokyo (JP); 

Ricoh Corporation, Menlo Park, CA 
(US) 

( * ) Notice: Subject to any disclaimer, the term of this 
patent is extended or adjusted under 35 
U.S.C. 154(b) by 45 days. 

(21) Appl. No.: 09/661,184 

(22) Filed: Sep. 13, 2000 

Related U.S. Application Data 

(62) Division of application No. 08/995,616, filed on Dec. 22, 
1997. 

(51) Int. Cl.7 G06F 17/21

(52) U.S. Cl. 707/512; 707/3; 707/501.1

(58) Field of Search 707/512, 501.1, 

707/513, 3-6 

(56) References Cited 

U.S. PATENT DOCUMENTS 



4,417,239 A 11/1983 340/709

4,823,303 A 4/1989 Terasawa 364/521

5,153,831 A 10/1992 Yianilos 707/531

5,309,359 A * 5/1994 Katz et al. 364/419.19

5,349,658 A 9/1994 O'Rourke et al.

5,384,703 A 1/1995 Withgott et al.

5,404,295 A 4/1995 Katz et al.

5,418,948 A 5/1995 Turtle 707/4

5,442,795 A 8/1995 Levine et al.

5,479,600 A 12/1995 Wroblewski et al.

5,481,666 A 1/1996 Nguyen et al.

5,596,700 A 1/1997 Darnell et al. 345/340

5,638,543 A 6/1997 Pedersen et al. 395/751

5,680,636 A 10/1997 Levine et al. 395/800

5,694,559 A * 12/1997 Hobson et al. 395/336

5,721,897 A * 2/1998 Rubinstein 395/602

5,737,599 A 4/1998 Rowe et al. 707/104

5,748,805 A 5/1998 Withgott et al. 382/306

(List continued on next page.) 

FOREIGN PATENT DOCUMENTS 



EP 0378848 A2 7/1990 

EP 0459174 A2 12/1991 

EP 0737927 A2 10/1996 

EP 0762297 A2 3/1997 

EP 0802492 A1 10/1997

GB 2137788 10/1984 

GB 2156118 10/1985 

GB 2234609 6/1991 

GB 2290898 10/1999 

JP 8297677 A 11/1996 



OTHER PUBLICATIONS 

Lam, Wai and Low, Kon-Fan "Automatic document clas- 
sification based on probabilistic reasoning: model and per- 
formance analysis," pp. 2719-2723, Oct. 1997, Systems, 
Man, and Cybernetics, vol. 3.* 

Begole et al., "Supporting Worker Independence in Collabo- 
ration Transparency," doc. ID: ncstrl.vatech_cs/TR-98-12, 
Virginia Polytechnic Institute and State University (1994). 

(List continued on next page.) 

Primary Examiner— Joseph H. Feild 

(74) Attorney, Agent, or Firm — Townsend and Townsend 

and Crew LLP 

(57) ABSTRACT 

An automatic reading assistance application for documents 
available in electronic form. An automatic annotator is 
provided which finds concepts of interest and keywords. The 
operation of the annotator is personalizable for a particular 
user. The annotator is also capable of improving its performance
over time by both automatic and manual feedback.
The annotator is usable with any electronic document. 
Another available feature is a thumbnail image of all or part 
of a multi-page document wherein a currently displayed 
section of the document is highlighted in the thumbnail 
image. Movement of the highlighted area in the thumbnail 
image is then synchronized with scrolling through the docu- 
ment. 

23 Claims, 14 Drawing Sheets 




06/10/2004, EAST Version: 1.4.1 



US 6,457,026 B1

Page 2 



U.S. PATENT DOCUMENTS 



5,761,655 A 6/1998 Hoffman 

5,778,397 A 7/1998 Kupiec et al.

5,781,785 A 7/1998 Rowe et al. 707/513

5,784,616 A 7/1998 Horvitz 709/102

5,819,301 A 10/1998 Rowe et al. 707/513

5,832,474 A 11/1998 Lopresti et al. 

5,838,317 A 11/1998 Bolnick et al. 

5,857,185 A 1/1999 Yamaura 707/5 

5,860,074 A 1/1999 Rowe et al 707/526 

5,870,770 A 2/1999 Wolfe 707/501 

5,873,107 A 2/1999 Borovoy et al 707/501 

5,933,841 A * 8/1999 Schumacher et al. 707/501

5,943,679 A 8/1999 Niles et al.

5,946,678 A 8/1999 Aalbersberg 707/3

5,950,187 A 9/1999 Tsuda 707/3

5,987,454 A 11/1999 Hobbs

6,006,218 A 12/1999 Breese et al 707/3 

6,021,403 A 2/2000 Horvitz et al 706/45 

6,026,409 A 2/2000 Blumenthal 

6,028,601 A 2/2000 Machiraju et al 345/336 

6,055,542 A 4/2000 Nielsen et al. 

6,094,648 A * 7/2000 Aalbersberg 707/3 

6,101,503 A 8/2000 Cooper et al. 

6,182,090 B1 1/2001 Peairs



OTHER PUBLICATIONS 

Begole et al., "Flexible Collaboration Transparency," doc.
ID: ncstrl.vatech_cs/TR-98-11, Virginia Polytechnic
Institute and State University (1998).

Byrd, D., "A Scrollbar-based Visualization for Document
Navigation," doc. ID: xxx.cs.IR/9902028, Computing
Research Repository: Information Retrieval (1999).

"Flexible JAMM Screenshots," downloaded from internet
site http://simon.cs.vt.edu/~jamm May 4, 2000.

Taghva, K. et al., "An Evaluation of an Automatic Markup
System," Proceedings of the SPIE, vol. 2422, pp. 317-327,
USA 1995.

Schweighofer, E. et al., "The Automatic Generation of
Hypertext Links," Database and Expert Systems Applications,
7th Int'l. Conference, DEXA '96 Proceedings, pp.
889-898, Springer-Verlag, Berlin, Germany, 1996.

Sumita, K. et al., "Document Structure Extraction for Interactive
Document Retrieval Systems," Conference Proceedings,
SIGDOC'93, 11th Annual Int'l. Conference, pp.
301-310, New York NY 1993.

Brandow, R., "Automatic Condensation of Electronic Publications
by Sentence Selection," Information Processing &
Management, vol. 31, No. 5, pp. 675-685, 1995.

Apple Computer, Inc., "Getting help," and "Turning Balloon
Help," Macintosh Data Book, Reference 7 System, in Chapter
1 entitled "A review of Standard Macintosh Operations,"
pp. 30-31 (1991).

Manber, Udi, "The Use of Customized Emphasis in Text
Visualization," Proceedings 1997 IEEE Conference on
Information Visualization, (Preliminary Version), An International
Conference on Computer Visualization & Graphics,
Aug. 27-29, 1997, London, England, pp. 132-138, (Aug.
27-29, 1997).

Adobe Systems, Inc., "Adobe Photoshop 4.0 User Guide for
Macintosh and Windows," 1996, Title Page, Copyright
Page, Chap. 2, pp. 30-31.

Gliedman, J., "Virtual Office Managers," Computer Shopper,
vol. 18, No. 9, pp. 290-294, (Sep. 1998).

Hill, W.C., and Hollan, J.D., "Edit Wear and Read Wear,"
Computer Graphics and Interactive Media Research Group,
pp. 3-9, (May, 1992).

Langley, P., and Sage, S., "Induction of Selective Bayesian
Classifiers," Proc. of the Tenth Conference on Uncertainty in
Artificial Intelligence, Seattle, WA, pp. 399-406 (1994).

Langley, P., Iba, W., and Thompson, K., "An Analysis of
Bayesian Classifiers," Proc. of the Tenth National Conference
on Artificial Intelligence, San Jose, CA, pp. 223-228
(1992).

Taxt, T., Flynn, P.J. and Jain, A.K., "Segmentation of Document
Images," IEEE Transactions on Pattern Analysis and
Machine Intelligence, vol. 11, No. 12, pp. 1322-1329, (Dec.
1989).

Adobe Acrobat Reader 3.0 screen dumps (fig. 1-3), (1996).

Ball, Thomas, and Eick, Stephen G., "Software Visualization
in the Large," IEEE Computer, vol. 29, No. 4, Apr.
1996, pp. 33-43,
http://www.computer.org/computer/co1996/r4033abs.htm.

Boguraev et al., "Salience-Based Content Characterization
of Text Documents," Proceedings of the ACL/EACL Workshop
on Intellegent [Sic] Scalable Text Summarization,
1997, Topic identification, Discourse-based summarization,
pp. 1-12.

Greenberg, et al., "Sharing fisheye views in relaxed-WYSIWIS
groupware applications," Proceedings of Graphics
Interface, Toronto, Canada, May 22-24, 1995, Distributed
by Morgan-Kaufmann, pp. 28-38,
http://www.cpsc.ucalgary.ca/grouplab/papers/1996/96-Fisheye.GI/gi96_fisheye.html.

Hearst et al., "TileBars: Visualization of Term Distribution
Information in Full Text Information Access," Proceedings
of the ACM SIGCHI Conference on Human Factors in
Computing Systems (CHI), Denver, CO., May 1995, pp. 1-8,
http://www.acm.org/sigchi/chi95/Electronic/documnts/papers/mah_bdy.htm.

* cited by examiner 



U.S. Patent Sep. 24, 2002 Sheet 1 of 14 US 6,457,026 B1

[Sheets 1 to 5 of 14: drawings only; no figure labels or text survive.]



[Sheet 6 of 14: FIG. 3. Screen capture; the window title appears to read "ReadersHelper: Query-free Information Retrieval".]

[Sheet 7 of 14: FIG. 4. Screen capture with the same window title.]



[Sheet 8 of 14: FIG. 5 and FIG. 6A.

FIG. 5 block labels: user 504; doc browser 506; annotation agent 508,
containing text processing 510 (input: raw text, e.g. from Internet or
OCR'ed documents; output: parsed text stream), content recognition 512
(output: annotated text stream) and formatting 514 (output: formatted
text file); user profile 516; profile editor 518.

FIG. 6A block labels: user 504; doc browser 506; formatting 514 (HTML,
LaTeX, Postscript, OCR'ed); content recognition 512; text processing 510,
containing file I/O 612 (read file), updating 614 (update history) and
language processing 616 (parse text); user profile 516; profile editor 518;
access/update history; history 618; resource channel; 602.]



[Sheet 9 of 14: FIG. 6B.

Block labels: user 504; doc browser 506; formatting 514 (HTML, LaTeX,
Postscript, OCR'ed); content recognition 512, containing match patterns 620,
add annotation tags 622 and update profile 624; record history of matches;
search for user concepts; user profile 516; profile editor 518.]



[Sheet 10 of 14: FIG. 6C and FIG. 7.

FIG. 6C block labels: user 504; doc browser 506; annotation agent 508;
formatting 514 (HTML, LaTeX, Postscript, OCR'ed), containing render text 626;
content recognition 512; text processing 510; user profile 516;
profile editor 518; history 618; resource channel; network; 602.

FIG. 7: only reference numeral 700 survives.]



[Sheet 11 of 14: drawing only; no text survives.]

[Sheet 12 of 14: screen capture; the window title appears to read "ReadersHelper: Query-free Information Retrieval". The figure label is truncated to "FIG."]

[Sheet 13 of 14: FIG. 9B.]



[Sheet 14 of 14: FIG. 10. Annotated HTML source. Callout numerals 1002, 1004, 1006 and 1008 in the drawing point at the annotation tags. One NUMBER value is illegible and is marked as such.]

<RH.ANOH.S NUMBER=4>
We have approached this challenge by introducing an
<RH.ANOH CONCEPT="Intelligent Agents" SUBCONCEPT="intelligent agent" SENTENCE="4" NUMBER=1>intelligent agent</RH.ANOH> that analyzes interactions
between user and <RH.ANOH CONCEPT="Bayes Inference" SUBCONCEPT="expert system" SENTENCE="4" NUMBER=3>expert system</RH.ANOH> and automatically constructs
database queries based on this analysis</RH.ANOH.S>. The user is unobtrusively
notified when information relevant to the current diagnostic context has been
returned, and may immediately access it if desired. From the user's perspective
all database machinery is entirely transparent; indeed no formal query
language is even made available. Hence we term this approach query-free
information retrieval. <p>

<RH.ANOH.S NUMBER=5>
As we hope will be apparent from what follows, the introduction of the
<RH.ANOH CONCEPT="Intelligent Agents" SUBCONCEPT="intelligent agent" SENTENCE="5" NUMBER=2>intelligent agent</RH.ANOH> additionally offers one solution
to a fundamental problem facing designers of cooperative information
systems: How can legacy systems of substantial complexity be integrated within
a larger system context</RH.ANOH.S>? By requiring that all interactions with
the legacy database be mediated by the agent, we have been able to isolate the
database system cleanly while still supporting query-free information
retrieval. <p>

<RH.ANOH.S NUMBER=6>
FIXIT is comprised of the three subsystems already mentioned: the probabilistic
<RH.ANOH CONCEPT="Bayes Inference" SUBCONCEPT="expert system" SENTENCE="6" NUMBER=4>expert system</RH.ANOH>, the legacy full-text database system (to
which we added a new, semantically-based, indexing structure that supports limited
<RH.ANOH CONCEPT="Natural Language" SUBCONCEPT="natural language" SENTENCE="6" NUMBER=[illegible]>natural language</RH.ANOH> queries), and the
<RH.ANOH CONCEPT="Intelligent Agents" SUBCONCEPT="intelligent agent" SENTENCE="6" NUMBER=3>intelligent agent</RH.ANOH> that effectively integrates
them</RH.ANOH.S>. The following sections describe these system components,
provide implementation details, illustrate the runtime behavior of FIXIT, report
on operational experience, and close with some observations about query-free
information retrieval and the potential for generalizing the underlying
paradigm. <p>

<h2>FIXIT's System Components</h2>

We first describe the probabilistic expert sub-system and the information
retrieval sub-system. Before briefly describing these, we stress that our purpose
was not necessarily to advance the capabilities of the individual components
or indeed even to exploit fully the best current technology; instead, we
focus on their integration. <p>
<p>

FIG. 10

SYSTEM TO FACILITATE READING A
DOCUMENT

The present application is a divisional application of and
claims priority from U.S. patent application Ser. No. 08/995,616,
filed Dec. 22, 1997, pending, the entire contents of which
are herein incorporated by reference for all purposes.

BACKGROUND OF THE INVENTION 

The present invention relates to display of electronic
documents and more particularly to a method and apparatus
for augmenting electronic document display with features to
enhance the experience of reading an electronic document
on a display.

Increasingly, readers of documents are being called upon 
to assimilate vast quantities of information in a short period 
of time. To meet the demands placed upon them, readers find 
they must read documents "horizontally," rather than 
"vertically," i.e., they must scan, skim, and browse sections
of interest in multiple documents rather than read and 
analyze a single document from beginning to end. 

Documents are now more and more available in electronic 
form. Some documents are available electronically by virtue 
of their having been locally created using word processing
software. Other electronic documents are accessible via the 
Internet. Yet others may become available in electronic form 
by virtue of being scanned in, copied, or faxed. See commonly
assigned U.S. Pat. No. 5,978,477, entitled AUTOMATIC AND
TRANSPARENT DOCUMENT
ARCHIVING, the contents of which are herein incorporated 
by reference. 

However, the mere availability of documents in electronic 
form does not assist the reader in confronting the challenges 
of assimilating information quickly. Indeed, many
time-challenged readers still prefer paper documents because of
their portability and the ease of flipping through pages. 

Certain tools exist to take advantage of the electronic
form of documents to assist harried readers. Tools exist to
search for documents both on the Internet and locally.
However, once the document is identified and retrieved,
further search capabilities are limited to keyword searching.
Automatic summarization techniques have also been developed
but are limited in that they are not personalized: they
summarize based on general features found in sentences.

What is needed is a document display system that helps
the reader find as well as assimilate the information he or she
wants more quickly. The document display system should be
easily personalizable and flexible as well.

SUMMARY OF THE INVENTION 

An automatic reading assistance application for documents
in electronic form is provided by virtue of the present
invention. In certain embodiments, an automatic annotator is
provided which finds concepts of interest and keywords. The
operation of the annotator is personalizable for a particular
user. The annotator is also capable of improving its performance
over time by both automatic and manual feedback.
The annotator is usable with any electronic document.
Another available feature is an elongated thumbnail image of
all or part of a multi-page document wherein a currently
displayed section of the document is emphasized in the
elongated thumbnail image. Movement of the emphasized
area in the elongated thumbnail image is then synchronized
with scrolling through the document.

In accordance with a first aspect of the present invention,
a method for annotating an electronically stored document 
includes steps of: accepting user input indicating user- 
specific concepts of interest, analyzing the electronic docu- 
ment to identify locations of discussion of the user-specific 
concepts of interest, and displaying the electronic document 
with visual indications of the identified locations. 
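
The three steps of the first aspect can be illustrated with a minimal sketch. It assumes, purely for illustration, that the user-specific concepts are supplied as a mapping from concept name to key phrases and that a location is a character span; the names find_concept_locations and mark_locations and the "[[ ]]" markers are hypothetical and do not come from the specification, which locates concepts with a belief network rather than plain phrase matching.

```python
import re
from typing import Dict, List, Tuple

def find_concept_locations(text: str,
                           concepts: Dict[str, List[str]]) -> List[Tuple[int, int, str]]:
    """Return (start, end, concept) for each key-phrase occurrence, in text order."""
    hits = []
    for concept, phrases in concepts.items():
        for phrase in phrases:
            # Whole-word, case-insensitive match (an assumption of this sketch).
            pattern = re.compile(r"\b" + re.escape(phrase) + r"\b", re.IGNORECASE)
            for m in pattern.finditer(text):
                hits.append((m.start(), m.end(), concept))
    return sorted(hits)

def mark_locations(text: str, hits: List[Tuple[int, int, str]],
                   open_mark: str = "[[", close_mark: str = "]]") -> str:
    """Wrap each identified span in visible markers, skipping overlapping spans."""
    out, pos = [], 0
    for start, end, _concept in hits:
        if start < pos:
            continue
        out.append(text[pos:start])
        out.append(open_mark + text[start:end] + close_mark)
        pos = end
    out.append(text[pos:])
    return "".join(out)

# Step 1: user input; step 2: analysis; step 3: display with visual indications.
profile = {"Intelligent Agents": ["intelligent agent"],
           "Bayes Inference": ["expert system"]}
sample = "An intelligent agent queries the expert system."
locations = find_concept_locations(sample, profile)
print(mark_locations(sample, locations))
```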

In accordance with a second aspect of the present
invention, a method for displaying a multi-page document
includes steps of: displaying an elongated thumbnail image of
a multi-page document in a first viewing area of a display,
displaying a section of the multi-page document in a second
viewing area of the display in legible form, emphasizing an
area of the elongated thumbnail image corresponding to the
section displayed in the second viewing area, accepting user
input controlling sliding of the emphasized area through the
thumbnail image, and scrolling the displayed section
through the second viewing area responsive to the sliding
so that the emphasized area continues to correspond to the
displayed section.
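
The correspondence between the emphasized area and the displayed section can be sketched as a linear mapping between thumbnail coordinates and document coordinates. This is an illustrative assumption only: the class ThumbnailSync, its field names, and the linear scaling are hypothetical and are not taken from the specification.

```python
from dataclasses import dataclass
from typing import Tuple

@dataclass
class ThumbnailSync:
    doc_height: float    # height of the whole document, in document units
    view_height: float   # height of the legible (second) viewing area
    thumb_height: float  # height of the elongated thumbnail image

    def emphasized_area(self, scroll_top: float) -> Tuple[float, float]:
        """Top and height of the emphasized area, in thumbnail units."""
        top = scroll_top * self.thumb_height / self.doc_height
        shown = min(self.view_height, self.doc_height)
        return top, shown * self.thumb_height / self.doc_height

    def scroll_for_slide(self, emphasized_top: float) -> float:
        """Document scroll offset that matches a slid emphasized area."""
        max_scroll = max(0.0, self.doc_height - self.view_height)
        scroll = emphasized_top * self.doc_height / self.thumb_height
        return min(max(scroll, 0.0), max_scroll)

sync = ThumbnailSync(doc_height=10000, view_height=500, thumb_height=400)
print(sync.emphasized_area(2500))    # emphasized area for a given scroll offset
print(sync.scroll_for_slide(100.0))  # scroll offset after the user slides the area
```

Clamping the scroll offset keeps the emphasized area inside the thumbnail when the user drags past either end.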

A further understanding of the nature and advantages of 
the inventions herein may be realized by reference to the 
remaining portions of the specification and the attached 
drawings. 

BRIEF DESCRIPTION OF THE DRAWINGS 

FIG. 1 depicts a representative computer system suitable 
for implementing the present invention. 

FIGS. 2A-2D depict document browsing displays in 
accordance with one embodiment of the present invention. 

FIG. 3 depicts a document summary view in accordance 
with one embodiment of the present invention. 

FIG. 4 depicts a table of contents view in accordance with 
one embodiment of the present invention. 

FIG. 5 depicts a top-level software architectural diagram 
for automatic annotation in accordance with one embodi- 
ment of the present invention. 

FIGS. 6A-6C depict a detailed software architectural 
diagram for automatic annotation in accordance with one 
embodiment of the present invention. 

FIG. 7 depicts a representative Bayesian belief network 
useful in automatic annotation in accordance with one 
embodiment of the present invention. 

FIG. 8 depicts a user interface for defining a user profile 
in accordance with one embodiment of the present inven- 
tion. 

FIGS. 9A-9B depict an interface for providing user 
feedback in accordance with one embodiment of the present 
invention. 

FIG. 10 depicts a portion of an HTML document pro- 
cessed in accordance with one embodiment of the present 
invention. 

DESCRIPTION OF SPECIFIC EMBODIMENTS 
Computer System Usable for Implementing the Present 
Invention 

FIG. 1 depicts a representative computer system suitable
for implementing the present invention. FIG. 1 shows basic
subsystems of a computer system 10 suitable for use with the
present invention. In FIG. 1, computer system 10 includes a
bus 12 which interconnects major subsystems such as a
central processor 14, a system memory 16, an input/output
controller 18, an external device such as a printer 20 via a
parallel port 22, a display screen 24 via a display adapter 26,
a serial port 28, a keyboard 30, a fixed disk drive 32 and a
floppy disk drive 33 operative to receive a floppy disk 33A.
Many other devices may be connected such as a scanner 34
via I/O controller 18, a mouse 36 connected to serial port 28
or a network interface 40. Many other devices or subsystems
(not shown) may be connected in a similar manner. Also, it
is not necessary for all of the devices shown in FIG. 1 to be
present to practice the present invention, as discussed below.
The devices and subsystems may be interconnected in
different ways from that shown in FIG. 1. The operation of
a computer system such as that shown in FIG. 1 is readily
known in the art and is not discussed in detail in the present
application. Source code to implement the present invention
may be operably disposed in system memory 16 or stored on
storage media such as a fixed disk 32 or a floppy disk 33A.
Image information may be stored on fixed disk 32.

Annotated Document User Interface

The present invention provides a personalizable system 
for automatically annotating documents to locate concepts 
of interest to a particular user. FIG. 2A depicts one user 
interface 200 for viewing a document that has been annotated
in accordance with the present invention. A first
viewing area 202 shows a section of an electronic document. 
Using a scroll bar 204, or in other ways, the user may scroll 
the displayed section through the electronic document. 

A series of concept check boxes 206 permit the user to
select which concepts of interest are to be noted in the
document. A sensitivity control 208 permits the user to select
the degree of sensitivity to apply in identifying potential
locations of relevant discussion. At low sensitivity, more
locations will be denoted as being relevant, even though
some may not be of any actual interest. At high sensitivity,
almost all denoted locations will in fact be relevant but some
other relevant locations may be missed. After each concept
name appearing by one of checkboxes 206 appears a percentage
giving the relevance of the currently viewed document
to the concept. These relevance levels offer a quick
assessment of the relevance of the document to the selected
concepts. FIG. 2A shows no annotations because a plain text
view rather than an annotated view has been selected for first
viewing area 202.
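
The trade-off governed by sensitivity control 208 can be sketched as a threshold applied to per-location relevance scores. The sketch assumes, hypothetically, that each candidate location carries a probability-like score in [0, 1] and that the sensitivity setting is used directly as the cut-off; the specification does not state how the control maps to a threshold, and the name select_locations is illustrative.

```python
from typing import List, Tuple

def select_locations(scored: List[Tuple[str, float]], sensitivity: float) -> List[str]:
    """Keep locations whose score meets the cut-off implied by the sensitivity.

    Assumption of this sketch: higher sensitivity means a stricter cut-off,
    so fewer but more reliable locations are denoted.
    """
    return [location for location, score in scored if score >= sensitivity]

candidates = [("paragraph 1", 0.90), ("paragraph 2", 0.55), ("paragraph 3", 0.20)]
print(select_locations(candidates, 0.1))  # low sensitivity: everything is denoted
print(select_locations(candidates, 0.8))  # high sensitivity: only the strongest hit
```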

A thumbnail view 214 of the entire document is found in 
a second viewing area 215. Thumbnail view 214
is discussed in greater detail below.

Miscellaneous navigation tools are found on a navigation 
toolbar 216. Miscellaneous annotation tools are found on an
annotation toolbar 218. The annotation tools on annotation 
toolbar 218 facilitate navigation through a collection of 
documents. 

According to the present invention, annotations may be
added to the text displayed in first viewing area 202. The
annotations denote text relevant to user-selected concepts.
As will be explained further below, an automatic annotation
system according to the present invention adds these annotations
to any document available in electronic form. The
document need not include any special information to assist
in locating discussion of concepts of interest.

FIG. 2B depicts the document view of FIG. 2A but with
annotation added in first viewing area 202. Phrases 220 have
been highlighted to indicate that they relate to concepts of
interest to the user. The highlighting is preferably color.
However, for ease of illustration in black-and-white format,
rectangles indicate the highlighted areas of text. For further
emphasis, the highlighted text is preferably printed in bold.
A rectangular bar 222 indicates a paragraph that has been
determined to have relevance above a predetermined threshold
or to have more than a threshold number of key phrases.
Rectangular bar 222 is merely representative of various
forms of marginal annotation that might be used to indicate
a relevant section of the text.
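
The rule for placing a marginal bar such as rectangular bar 222 can be written as a two-part test. The default threshold values below are placeholders chosen for illustration; the specification says only that the thresholds are predetermined, and the function name is hypothetical.

```python
def paragraph_gets_margin_bar(relevance: float, key_phrase_count: int,
                              relevance_threshold: float = 0.5,
                              count_threshold: int = 2) -> bool:
    """True if a paragraph exceeds either threshold (placeholder defaults)."""
    return relevance > relevance_threshold or key_phrase_count > count_threshold

# Each paragraph is summarized as (relevance score, number of key phrases).
paragraphs = [(0.7, 0), (0.1, 3), (0.1, 2)]
print([paragraph_gets_margin_bar(r, n) for r, n in paragraphs])
```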

FIG. 2C depicts an alternative style of annotation. Now in 
first viewing area 202, entire sentences 224 including 
phrases relevant to concepts of interest are highlighted. The 
phrases themselves are printed in bold text. It has been found 
that highlighting the entire sentence rather than just a 
relevant phrase provides the user with far more information 
at a glance. 
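
The sentence-level style of FIG. 2C can be sketched as follows. The sketch uses a deliberately naive sentence splitter and text markers ("<<", ">>", "**") in place of color and bold type; both are assumptions made for illustration, and highlight_sentences is a hypothetical name.

```python
import re
from typing import List

def highlight_sentences(text: str, phrases: List[str],
                        sent_open: str = "<<", sent_close: str = ">>",
                        bold: str = "**") -> str:
    """Mark whole sentences that contain a key phrase and embolden the phrase."""
    if not phrases:
        return text
    # Naive split after ., ! or ? followed by whitespace (illustration only).
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    pattern = re.compile("|".join(re.escape(p) for p in phrases), re.IGNORECASE)
    marked = []
    for sentence in sentences:
        if pattern.search(sentence):
            inner = pattern.sub(lambda m: bold + m.group(0) + bold, sentence)
            sentence = sent_open + inner + sent_close
        marked.append(sentence)
    return " ".join(marked)

print(highlight_sentences("The agent builds queries. The user is notified.", ["agent"]))
```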

FIG. 2D depicts how further information about key 
phrases may be displayed. The user may select any high- 
lighted key phrase with the mouse. Upon selection of the key 
phrase, a balloon 226 appears. The balloon includes further 
information relevant to the key phrase. For example, the 
balloon may include the name of the concept to which the 
keyword is relevant. The balloon may also include biblio- 
graphic information if the key phrase includes a citation. 

FIG. 3 depicts a document summary view in accordance 
with one embodiment of the present invention. The user may 
optionally select a summary view 300 of the document. 
Summary view 300 lists the concepts of interest 302 that are
found in the document as headings of an outline. For each
concept, keywords or key phrases 304 are listed which are
indicative of the concept of interest. A number in parentheses
by each keyword indicates the number of times the keyword 
or key phrase appears. Each concept also has an associated 
score 306 indicative of the relevance of the whole document 
to the concept. 
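
The outline of summary view 300 can be approximated by counting key-phrase occurrences per concept. The sketch below covers only the headings and the parenthesized counts; it does not compute score 306, which the specification derives elsewhere. The profile structure and function names are assumptions made for illustration.

```python
import re
from typing import Dict, List

def summarize(text: str, concepts: Dict[str, List[str]]) -> Dict[str, Dict[str, int]]:
    """Map each concept found in the text to its key phrases and their counts."""
    summary: Dict[str, Dict[str, int]] = {}
    for concept, phrases in concepts.items():
        counts = {}
        for phrase in phrases:
            found = re.findall(r"\b" + re.escape(phrase) + r"\b", text, flags=re.IGNORECASE)
            if found:
                counts[phrase] = len(found)
        if counts:  # concepts with no hits are left out of the outline
            summary[concept] = counts
    return summary

def render_summary(summary: Dict[str, Dict[str, int]]) -> str:
    """Render concepts as headings with indented 'phrase (count)' lines."""
    lines = []
    for concept, counts in summary.items():
        lines.append(concept)
        for phrase, n in counts.items():
            lines.append(f"    {phrase} ({n})")
    return "\n".join(lines)

doc = "An intelligent agent helps. The intelligent agent queries the expert system."
profile = {"Intelligent Agents": ["intelligent agent"],
           "Natural Language": ["natural language"]}
print(render_summary(summarize(doc, profile)))
```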

FIG. 4 depicts a table of contents view in accordance with 
one embodiment of the present invention. An alternative to 
summary view 300 is a table of contents view 400. Table of 
contents view 400 lists major headings 402 and subheadings 
403 of the electronic document. By selecting one of hierar- 
chical display icons 404, the user may list the concepts 406 
found under one of the document headings 402 or subhead- 
ings 403 with an indication of relevance for each concept 
and the number of keywords found. There is also a relevance 
meter 408 for each document heading 402 that indicates the 
overall relevance of the text under that heading for all of the 
currently selected concepts. In a preferred embodiment 
where the document is an HTML document, to create 
table-of-contents view 400, the headings of the document 
are identified by an analysis of the HTML heading tags. 
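
Identifying headings from HTML heading tags can be sketched with Python's standard html.parser module. The class and function names are hypothetical, and the sketch collects only heading level and text; it does not attach concepts or relevance meters to the headings.

```python
from html.parser import HTMLParser
from typing import List, Optional, Tuple

class HeadingCollector(HTMLParser):
    """Collect (level, text) pairs for <h1> to <h6> elements."""

    def __init__(self) -> None:
        super().__init__()
        self.headings: List[Tuple[int, str]] = []
        self._level: Optional[int] = None
        self._buffer: List[str] = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names, so "H2" arrives as "h2".
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self._level = int(tag[1])
            self._buffer = []

    def handle_data(self, data):
        if self._level is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if self._level is not None and tag == f"h{self._level}":
            text = " ".join("".join(self._buffer).split())  # normalize whitespace
            self.headings.append((self._level, text))
            self._level = None

def table_of_contents(html: str) -> List[Tuple[int, str]]:
    parser = HeadingCollector()
    parser.feed(html)
    parser.close()
    return parser.headings

page = "<h1>FIXIT</h1><p>text</p><h2> System <b>Components</b></h2>"
for level, title in table_of_contents(page):
    print("  " * (level - 1) + title)
```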

Automatic Annotation Software

FIG. 5 depicts a top-level software architectural diagram 
for automatic annotation in accordance with one embodi- 
ment of the present invention. A document 502 exists in 
electronic form. It may have been scanned in originally. It 
may be, e.g., in HTML, Postscript, LaTeX, other word 
processing or e-mail formats, etc. The description that 
follows assumes an HTML format. A user 504 accesses 
document 502 through a document browser 506 and an 
annotation agent 508. Document browser 506 is preferably 
a hypertext browsing program such as Netscape Navigator 
or Microsoft Explorer but also may be, e.g., a conventional 
word processing program. 

Annotation agent 508 adds the annotations to document 
502 to prepare it for viewing by document browser 506. 
Processing by annotation agent 508 may be understood to be 
in three stages, a text processing stage 510, a content 
recognition stage 512, and a formatting stage 514. The input 
to text processing stage 510 is raw text. The output from text 
processing stage 510 and input to content recognition stage 
512 is a parsed text stream, i.e., a text stream from which formatting
information such as special tags around particular words or
phrases has been removed. The output from content recognition stage
512 and input to formatting stage 514 is an annotated text 
stream. The output of formatting stage 514 is a formatted
text file viewable with document browser 506.

The processing of annotation agent 508 is preferably a
run-time process. The annotations are preferably not
pre-inserted into the text but are rather generated when user 504
requests document 502 for browsing. Thus, this is preferably
a dynamic process. Annotation agent 508 may also,
however, operate in the background as a batch process.

The annotation added by annotation agent 508 depends on
concepts of interest selected by user 504. User 504 also
inputs information used by annotation agent 508 to identify
locations of discussion of concepts of interest in document
502. In a preferred embodiment, this information defines the
structure of a Bayesian belief network. The concepts of
interest and other user-specific information are maintained
in a user profile file 516. User 504 employs a profile editor
518 to modify the contents of user profile file 516.

FIG. 6A depicts the automatic annotation software architecture
of FIG. 5 with text processing stage 510 shown in
greater detail. FIG. 6A shows that the source of document
502 may be accessed via a network 602. Possible sources
include, e.g., the Internet 604, an intranet 606, a digital copier
608 that captures document images, or other office equipment
610 such as a fax machine, scanner, printer, etc.
Another alternative source is the user's own hard drive 32.

Text processing stage 510 includes a file I/O stage 612, an
updating stage 614, and a language processing stage 616.
File I/O stage 612 reads the document file from network 602.
Updating stage 614 maintains a history of recently visited
documents in a history file 618. Language processing stage
616 parses the text of document 502 to generate the parsed
text output of text processing stage 510.

FIG. 6B depicts the automatic annotation software architecture
of FIG. 5 with content recognition stage 512 shown
in greater detail. A pattern identification stage 620 looks for
particular patterns in the parsed text output of text processing
stage 510. The particular patterns searched for are
determined by the contents of user profile file 516. Once the
patterns are found, annotation tags are added to the parsed
text by an annotation tag addition stage 622 to indicate the
pattern locations. In a preferred HTML embodiment, these
annotation tags are compatible with the HTML format.
However, the tagging process may be adapted to LaTeX,
Postscript, etc. A profile updating stage 624 monitors the
output of annotation tag addition stage 622 and analyzes text
surrounding the locations of concepts of interest. As will be
further discussed with reference to FIG. 7, profile updating

pattern identification stage 622. A first oval 702 represents a
particular user-specified concept of interest. Other ovals 704
represent subconcepts related to the concept identified by
oval 702. Each line between one of subconcept ovals 704
and concept oval 702 indicates that discussion of the subconcept
implies discussion of the concept. Each connection
between one of subconcept ovals 704 and concept oval 702
has an associated probability value indicated in percent.
These values in turn indicate the probability that the concept
is discussed given the presence of evidence indicating the
presence of the subconcept. Discussion of the subconcept is
in turn indicated by one or more keywords or key phrases
(not shown in FIG. 7).

The structure of Bayesian belief network 700 is only one
possible structure applicable to the present invention. For
example, one could employ a Bayesian belief network with
more than two levels of hierarchy so that the presence of
subconcepts is suggested by the presence of "subsubconcepts"
and so on. In the preferred embodiment, presence of
a keyword or key phrase always indicates presence of
discussion of the subconcept but it is also possible to
configure the belief network so that presence of a keyword
or key phrase suggests discussion of the subconcept with a
specified probability.

The primary source for the structure of Bayesian belief
network 700 including the selection of concepts, keywords
and key phrases, interconnections, and probabilities is user
profile file 516. In a preferred embodiment, user profile file
516 is selectable for both editing and use from among
profiles for many users.

The structure of belief system 700 is however also modifiable
during use of the annotation system. The modifications
may occur automatically in the background or may
involve explicit user feedback input. The locations of concepts
of interest determined by pattern identification stage
620 are monitored by profile updating stage 624. Profile
updating stage 624 notes the proximity of other keywords
and key phrases within each analyzed document to the
locations of concepts of interest. If particular keywords and
key phrases are always near a concept of interest, the
structure and contents of belief system 700 are updated in
the background without user input by profile updating stage
624. This could mean changing probability values, introducing
a new connection between a subconcept and concept,
or introducing a new keyword or key phrase.

User 504 may select a word or phrase in document 502 as
stage 624; changes the contents of user profile file 516 based being relevant to a particular concept even though the word 
on the analysis of this surrounding text. The effect is to or phrase has not yet defined to be a keyword or key phrase, 
automatically refine the patterns searched for by pattern Belief system 700 is then updated to include the new 
identification stage 620 to improve annotation performance. 50 keyword or key phrase 

FIG. 6C depicts the automatic annotation software archi- User 504 may also give feedback for an existing key word 
tecture of FIG. 5 with formatting stage 514 shown in greater or key phrase, indicating the perceived relevance of the 
detail. Formatting stage 514 includes a text rendering stage keyword or key phrase to the concept of interest. If the 
626 that formats the annotated text provided by content selected keyword or key phrase is indicated to be of high 
recognition stage 512 to facilitate viewing by document 55 relevance to the concept of interest, the probability values 
browser 506. An HTML document as modified by format- connecting the subconcept indicated by the selected key- 
ting stage 514 is discussed in greater detail with reference to words or key phrases to the concept of interest increases. If, 
FIG. 10. on the other hand, user 504 indicates the selected keywords 
Pattern identification stage 620 looks for keywords and or key phrases to be of little interest, the probability values 
key phrases of interest and locates relevant discussion of 60 connecting these keywords or key phrases to the concept 
concepts based on the located keywords. The identification decrease. 

of keywords and the application of the keywords to locating User Profile and Feedback Interfaces 

relevant discussion is preferably accomplished by reference FIG. 8 depicts a user interface for defining a user profile 

to a belief system. The belief system is preferably a Bayesian in accordance with one embodiment of the present inven- 

belief network. 65 lion. User interface screen 800 is provided by profile editor 

FIG. 7 depicts a portion of a representative Bayesian 518. A profile name box 802 permits the user to enter the 

belief network 700 implementing a belief system as used by name of the person or group to whom the profile to be edited 
is assigned. This permits the annotation system according to
the present invention to be personalized to particular users or
groups. A password box 804 provides security by requiring
entry of a correct password prior to profile editing
operations.

A defined concepts list 806 lists all of the concepts which 
have already been added to the user profile. By selecting a 
concept add button 808, the user may add a new concept. By 
selecting a concept edit button 810, the user may modify the 
belief network as it pertains to the listed concept that is
currently selected. By selecting a remove button 812, the 
user may delete a concept. 

If a concept has been selected for editing, its name
appears in a concept name box 813. The portion of the belief
network pertaining to the selected concept is shown in a
belief network display window 814. Belief network display
window 814 shows the selected concept, the subconcepts
which have been defined as relating to the selected concept,
and the percentage values associated with each relationship.
The user may add a subconcept by selecting a subconcept
add button 815. The user may edit a subconcept by selecting
the subconcept in belief network display window 814 and
then selecting a subconcept edit button 816. A subconcept
remove button 818 permits the user to delete a subconcept
from the belief network.

Selecting subconcept add button 815 causes a subconcept
add window 820 to appear. Subconcept add window 820
includes a subconcept name box 822 for entering the name
of a new subconcept. A slider control 824 permits the user
to select the percentage value that defines the probability of
the selected concept appearing given that the newly selected
subconcept appears. A keyword list 826 lists the keywords
and key phrases which indicate discussion of the subconcept.
The user adds to the list by selecting a keyword add
button 828 which causes display of a dialog box (not shown)
for entering the new keyword or key phrase. The user deletes
a keyword or key phrase by selecting it and then selecting a
keyword delete button 830. Once the user has finished
defining the new subconcept, he or she confirms the
definition by selecting an OK button 832. Selection of a cancel
button 834 dismisses subconcept add window 820 without
affecting the belief network contents or structure. Selection
of subconcept edit button 816 causes display of a window
similar to subconcept add window 820 permitting
redefinition of the selected subconcept.
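
By way of illustration only, the information entered through subconcept add window 820 could be held in a structure such as the following Java sketch. The class and field names (`Subconcept`, `ConceptProfile`, `probabilityPercent`) are hypothetical and are not taken from the specification; the sketch merely mirrors the fields described above, namely a subconcept name, a percentage value, and a list of keywords and key phrases.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical model of one subconcept as entered in subconcept add window 820. */
final class Subconcept {
    final String name;                                // subconcept name box 822
    int probabilityPercent;                           // slider control 824, range 0-100
    final List<String> keywords = new ArrayList<>();  // keyword list 826

    Subconcept(String name, int probabilityPercent) {
        if (probabilityPercent < 0 || probabilityPercent > 100) {
            throw new IllegalArgumentException("percentage must be 0-100");
        }
        this.name = name;
        this.probabilityPercent = probabilityPercent;
    }
}

/** Hypothetical model of one concept and its subconcepts within a user profile. */
final class ConceptProfile {
    final String conceptName;
    // Insertion order is kept so a display window can list subconcepts as entered.
    final Map<String, Subconcept> subconcepts = new LinkedHashMap<>();

    ConceptProfile(String conceptName) {
        this.conceptName = conceptName;
    }

    /** Corresponds to confirming a new subconcept with OK button 832. */
    void addSubconcept(Subconcept s) {
        subconcepts.put(s.name, s);
    }

    /** Corresponds to subconcept remove button 818. */
    void removeSubconcept(String name) {
        subconcepts.remove(name);
    }
}
```

Dismissing the window with cancel button 834 would, under this sketch, simply discard the `Subconcept` object without calling `addSubconcept`.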

By selecting or deselecting a background learning checkbox
836, the user may enable or disable the
operation of profile updating stage 624. A web autofetch
check box 838 permits the user to select whether or not to
enable an automatic web search process. When this web
search process is enabled, whenever a particular keyword or
key phrase is found frequently near where a defined concept
is determined to be discussed, a web search tool such as
AltaVista™ is employed to look on the World Wide Web for
documents containing the keyword or key phrase. A
threshold slider control 840 is provided to enable the user to set a
threshold relevance level for this autofetching process.
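
The autofetch decision described above may be sketched as follows. The specification does not state how "found frequently near" is measured, so the measure used here (the fraction of concept locations that have the phrase within a fixed character window) is purely an assumption, as are the names `AutofetchPolicy`, `nearbyFraction`, and `shouldAutofetch`.

```java
import java.util.List;

/** Hypothetical autofetch decision; the relevance measure is an assumption. */
final class AutofetchPolicy {

    /**
     * Returns the fraction of concept locations that have the candidate
     * phrase within {@code window} characters. Offsets are character
     * positions in the analyzed document.
     */
    static double nearbyFraction(List<Integer> conceptOffsets,
                                 List<Integer> phraseOffsets,
                                 int window) {
        if (conceptOffsets.isEmpty()) {
            return 0.0;
        }
        int hits = 0;
        for (int c : conceptOffsets) {
            for (int p : phraseOffsets) {
                if (Math.abs(c - p) <= window) {
                    hits++;
                    break;  // count each concept location at most once
                }
            }
        }
        return (double) hits / conceptOffsets.size();
    }

    /** True when the measured fraction reaches the level set by slider control 840. */
    static boolean shouldAutofetch(double nearbyFraction, double threshold) {
        return nearbyFraction >= threshold;
    }
}
```

A web search would be issued only when `shouldAutofetch` returns true; the search itself is outside the scope of this sketch.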

FIGS. 9A-9B depict a user interface for providing feedback
in accordance with one embodiment of the present
invention. User 504 may select any text and call up a first
feedback window 902. The text may or may not have been
previously identified by the annotation system as relevant. In
first feedback window 902 shown in FIG. 9A, user 504 may
indicate the concept to which the selected text is relevant.
First feedback window 902 may not be necessary when
adjusting the relevance level for a keyword or key phrase
that is already a part of belief network 700. After the user
selects a concept in first feedback window 902, a second
feedback window 904 is displayed for selecting the degree
of relevance. Second feedback window 904 in FIG. 9B
provides three choices for level of relevance: good, medium
(not sure), and bad. Alternatively, a slider control could be
used to set the level of relevance. If the selected text is not
already a keyword or key phrase in belief network 700, a
new subconcept is added along with the associated new
keyword or key phrase. If the selected text is already a
keyword or key phrase, probability values within
belief system 700 are modified as described above in response to
this user feedback.
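
One possible way of modifying a probability value in response to the three choices of second feedback window 904 is sketched below. The specification does not give an update rule; the fixed step of 10 percentage points, the treatment of "medium" as leaving the value unchanged, and the names `FeedbackUpdate`, `Relevance`, and `adjust` are all assumptions made for illustration.

```java
/** Hypothetical feedback rule; the step size and MEDIUM behavior are assumptions. */
final class FeedbackUpdate {

    /** The three choices offered by second feedback window 904. */
    enum Relevance { GOOD, MEDIUM, BAD }

    /** Assumed adjustment, in percentage points, per feedback event. */
    static final int STEP_PERCENT = 10;

    /** Returns the adjusted percentage value, clamped to the range 0-100. */
    static int adjust(int currentPercent, Relevance feedback) {
        int delta;
        switch (feedback) {
            case GOOD:
                delta = STEP_PERCENT;   // high relevance raises the value
                break;
            case BAD:
                delta = -STEP_PERCENT;  // little relevance lowers the value
                break;
            default:
                delta = 0;              // MEDIUM ("not sure") leaves it unchanged
                break;
        }
        return Math.max(0, Math.min(100, currentPercent + delta));
    }
}
```

If a slider control is used instead of three choices, the same clamping could be applied to a continuous adjustment.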

FIG. 10 depicts a portion of an HTML document 1000 
processed in accordance with one embodiment of the present 
invention. A sentence including relevant text is preceded by
an <RH.ANOH.S . . . > tag 1002 and followed by an
</RH.ANOH.S> tag 1004. The use of these tags facilitates
the annotation mode where complete sentences are
highlighted. The <RH.ANOH.S . . . > tag 1002 includes a number
indicating which relevant sentence is tagged in order of
appearance in the document. Relevant text within a
so-tagged relevant sentence is preceded by an
<RH.ANOH . . . > tag 1006 and followed by an
</RH.ANOH> tag 1008. The <RH.ANOH . . . > tag 1006
includes the names of the concept and subconcept to which
the annotated text is relevant, an identifier indicating which
relevant sentence the text is in, and a number which identifies
which annotation this is in sequence for a particular concept.
An HTML browser that has not been modified to interpret 
the special annotation tags provided by the present invention 
will ignore them and display the document without annota- 
tions. 
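
The effect of ignoring the annotation tags can be illustrated by the following Java sketch, which removes them from a document string. Because the attribute syntax of the tags is not reproduced in the text above, the pattern accepts any attribute text after the tag name; the class name `AnnotationTags` and the method `strip` are hypothetical.

```java
import java.util.regex.Pattern;

/** Hypothetical helper that removes the special annotation tags from HTML text. */
final class AnnotationTags {

    // Matches opening and closing RH.ANOH and RH.ANOH.S tags, with or
    // without attribute text between the tag name and the closing '>'.
    private static final Pattern TAG =
            Pattern.compile("</?RH\\.ANOH(\\.S)?(\\s[^>]*)?>");

    /** Returns the document text with all annotation tags removed. */
    static String strip(String html) {
        return TAG.matcher(html).replaceAll("");
    }
}
```

Other markup in the document is left untouched, since the pattern matches only the two annotation tag names.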

Thumbnail Image Display 

Referring again to FIGS. 2A-2D, an elongated thumbnail
image 214 of many pages, or all, of document 502 is
presented in second viewing area 215. Document 502 will
typically be a multi-page document with a section being
displayed in first viewing area 202. Elongated thumbnail
image 214 provides a convenient view of the basic document
structure. The annotations incorporated into the document
are visible within elongated thumbnail image 214.
Within elongated thumbnail image 214, an emphasized area
214A shows a reduced view of the document section currently
displayed in first viewing area 202, with the reduction
ratio preferably being user-configurable. Thus, if the first
viewing area 202 changes in size because of a change of
window size, emphasized area 214A will also change in size
accordingly. The greater the viewing area allocated to
elongated thumbnail image 214 and emphasized area 214A, the
more detail is visible. With very small allocated viewing
areas, only sections of the document may be distinguishable.
As the allocated area increases, individual lines and
eventually individual words become distinguishable. In FIGS.
2A-2D the user-configured ratio is approximately 5:1.
Emphasized viewing area 214A may be understood to be a
lens or a viewing window over the part of elongated
thumbnail image 214 corresponding to the document section
displayed in first viewing area 202. User 504 may scroll
through document 502 by sliding emphasized area 214A up
and down. As emphasized area 214A shifts, the section of
document 502 displayed in first viewing area 202 will also
shift. User 504 may also scroll conventionally using scroll
bar 204 or arrow keys and emphasized area 214A will slide
up or down as appropriate in response.
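
The relationship between the first viewing area and emphasized area 214A may be sketched as follows, assuming a single uniform reduction ratio and a linear mapping between document coordinates and thumbnail coordinates. The class name `ThumbnailLens` and its methods are hypothetical, and the units (e.g., pixels) are an assumption.

```java
/** Hypothetical geometry of emphasized area 214A under a uniform reduction ratio. */
final class ThumbnailLens {
    private final double ratio;  // reduction ratio, e.g. 5.0 for approximately 5:1

    ThumbnailLens(double ratio) {
        if (ratio <= 0) {
            throw new IllegalArgumentException("ratio must be positive");
        }
        this.ratio = ratio;
    }

    /** Height of the emphasized area for a given height of the first viewing area. */
    double lensHeight(double viewHeight) {
        return viewHeight / ratio;
    }

    /** Top of the emphasized area for a given scroll offset of the first viewing area. */
    double lensTop(double scrollOffset) {
        return scrollOffset / ratio;
    }

    /** Inverse mapping: scroll offset to apply when the emphasized area is dragged. */
    double scrollOffsetFor(double lensTop) {
        return lensTop * ratio;
    }
}
```

With a ratio of 5:1, a first viewing area 600 units tall corresponds to an emphasized area 120 units tall, and resizing the window changes the emphasized area in proportion.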

In FIGS. 2A-2C elongated thumbnail image 214 displays 
each page of document 502 as being displayed at the same 
reduced scale. The present invention also contemplates other 
modes of scaling elongated thumbnail image 214. For 
example, one may display emphasized area 214A at a scale
similar to that shown in FIGS. 2A-2C and use a variable
scale for the rest of elongated thumbnail image 214. Text
far away from emphasized area 214A would be displayed at
a highly reduced scale and the degree of magnification
would increase with nearness to emphasized area 214A.

Because the annotations appear in elongated thumbnail
image 214, it is very easy to find relevant text anywhere in
document 502. Furthermore, elongated thumbnail image
214 provides a highly useful way of keeping track of one's
position within a lengthy document.
Software Implementation 

In a preferred embodiment, software to implement the
present invention is written in the Java language. Preferably,
the software forms a part of a stand-alone browser program
written in the Java language. Alternatively, the code may be
in the form of a so-called "plug-in" operating with a
Java-equipped web browser used to browse HTML documents
including the special annotation tags explained above.

In the foregoing specification, the invention has been
described with reference to specific exemplary embodiments
thereof. For example, any probabilistic inference method
may be substituted for a Bayesian belief network. It will,
however, be evident that various modifications and changes
may be made thereunto without departing from the broader
spirit and scope of the invention as set forth in the appended
claims and their full scope of equivalents.

What is claimed is: 

1. A computer-implemented method for annotating an
electronically stored document comprising:

storing first information identifying a plurality of concepts 
and one or more keywords associated with each con- 
cept in said plurality of concepts; 

receiving user input indicating selection of a set of one or 
more concepts from said plurality of concepts;

identifying, from said first information, one or more 
keywords associated with each concept in said set of 
concepts; 

searching said electronic document to identify locations 
of said keywords associated with concepts in said set of
concepts in said electronic document; 

displaying said electronic document with visual indica- 
tions of said identified locations; and 

displaying a relevance indicator for each concept in said
set of concepts, said relevance indicator indicating 
relevance of said document to said concept. 

2. The method of claim 1 wherein searching said
electronic document comprises exploiting a probabilistic
inference method to identify said locations.

3. The method of claim 2 wherein said probabilistic 
inference method comprises a Bayesian belief network. 

4. The method of claim 3 further comprising: 
accepting user input defining a structure of said Bayesian
belief network.

5. The method of claim 4 further comprising: 
modifying said Bayesian belief network in accordance
with content of previously visited electronic
documents.

6. The method of claim 3 further comprising:
accepting user input indicating a degree of relation
between said locations and said concepts in said set of
concepts; and

modifying said Bayesian belief network responsive to 
said degree of relation.

7. The method of claim 1 wherein displaying said elec- 
tronic document with visual indications comprises: 
highlighting sections of said document surrounding said
locations. 

8. The method of claim 1 wherein displaying said elec- 
tronic document with visual indications comprises: 

displaying a balloon pointing to a user-selected one of 
said locations, said balloon identifying a concept from 
said set of concepts to which text in said user-selected 
one of said locations is relevant. 

9. The method of claim 1 wherein displaying said elec- 
tronic document with visual indications comprises: 

displaying marginal notation identifying said locations. 

10. The method of claim 1 wherein said first information 
include a probability value associated with each keyword, 
said probability value indicating the probability of existence 
of a concept with which said keyword is associated given the 
presence of said keyword. 

11. A computer program product for annotating an elec- 
tronically stored document comprising: 

code for storing first information identifying a plurality of 
concepts and one or more keywords associated with 
each concept in said plurality of concepts; 

code for receiving user input indicating selection of a set 
of one or more concepts from said plurality of con- 
cepts; 

code for identifying, from said first information, one or 
more keywords associated with each concept in said set 
of concepts; 

code for searching said electronic document to identify 
locations of said keywords associated with concepts in 
said set of concepts in said electronic document; 

code for displaying said electronic document with visual 
indications of said identified locations; 

code for displaying a relevance indicator for each concept 
in said set of concepts, said relevance indicator indi- 
cating relevance of said document to said concept; and 

a computer-readable storage medium for storing the 
codes. 

12. The product of claim 11 wherein said code for 
searching said electronic document comprises code for 
exploiting a probabilistic inference method to identify said 
locations. 

13. The product of claim 12 wherein said probabilistic 
inference method comprises a Bayesian belief network. 

14. The product of claim 13 further comprising code for: 
accepting user input defining a structure of said Bayesian
belief network.

15. The product of claim 14 further comprising code for 
modifying said Bayesian belief network in accordance with 
content of said electronic document. 

16. The product of claim 15 wherein said code for 
modifying said Bayesian belief network comprises code for 
updating said Bayesian belief network in accordance with 
proximity of keywords to said identified locations. 

17. The product of claim 13 further comprising: 

code for accepting user input indicating a degree of 
relation between said locations and said concepts in 
said set of concepts; and 

code for modifying said Bayesian belief network respon- 
sive to said degree of relation. 

18. The product of claim 11 wherein said code for 
displaying said electronic document comprises code for 
highlighting said locations. 

19. The product of claim 11 wherein said code for 
displaying said electronic document comprises code for 
highlighting sections of said document surrounding said 
locations. 



20. The product of claim 11 wherein said code for 
displaying said electronic document comprises code for 
displaying balloons pointing to said locations. 

21. The product of claim 11 wherein said code for 
displaying said electronic document comprises code for
displaying marginal notations identifying said locations. 

22. A computer system comprising: 
a processor; and 

a computer-readable storage medium configured to store 
first information identifying a plurality of concepts and
one or more keywords associated with each concept in 
said plurality of concepts, and configured to store code 
to be executed by said processor, said code comprising: 

code for receiving user input indicating selection of a set
of one or more concepts from said plurality of con- 
cepts; 

code for identifying, from said first information, one or 
more keywords associated with concepts in said set of 
concepts;

code for searching an electronic document to identify 
locations of said keywords associated with concepts in 
said set of concepts; 

code for displaying said electronic document with visual 
indications of said identified locations; and 
code for displaying relevance indicators for said concepts
in said set of concepts, said relevance indicators indi- 
cating relevance of said document to said concepts in 
said set of concepts. 

23. A computer-implemented method for annotating an 
electronically stored document comprising: 

storing first information identifying one or more key- 
words associated with a concept of interest; 

receiving user input indicating selection of said concept of 
interest; 

analyzing said electronic document to identify locations 
in said electronic document of said one or more key- 
words associated with said concept of interest; 

displaying said electronic document with visual indica- 
tions of said identified locations; 

displaying a marginal notation identifying said locations 
whose relevance level to said concept of interest 
exceeds a threshold relevance level value; and 

displaying a relevance indicator for said concept of 
interest, said relevance indicator indicating relevance 
of said document to said concept of interest. 

***** 